46 research outputs found

    Accelerating the Gillespie τ-Leaping Method Using Graphics Processing Units

    The Gillespie τ-Leaping Method is an approximate algorithm that is faster than the exact Direct Method (DM) because the simulation progresses with larger time steps. However, the procedure to compute the time leap τ is quite expensive. In this paper, we explore the acceleration of the τ-Leaping Method using Graphics Processing Units (GPUs) for ultra-large networks ( reaction channels). We have developed data structures and algorithms that take advantage of the unique hardware architecture and available libraries. Our results show a performance gain of over 60x when compared with the best conventional implementations.
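    As a rough illustration of the leaping idea (not the paper's GPU implementation), the following Python sketch applies fixed-step τ-leaping to a hypothetical birth-death process; the rate constants, step size, and clamping at zero are illustrative assumptions:

```python
import numpy as np

def tau_leap(x0, rates, tau, steps, rng=None):
    """Simulate a birth-death process X -> X+1 (rate kb) and X -> X-1
    (rate kd*X) with fixed-step tau-leaping: each reaction channel fires
    a Poisson-distributed number of times per leap of length tau."""
    rng = rng or np.random.default_rng(0)
    kb, kd = rates
    x = x0
    traj = [x]
    for _ in range(steps):
        a = np.array([kb, kd * x])           # propensities of the two channels
        fires = rng.poisson(a * tau)         # firings per channel in [t, t+tau)
        x = max(x + fires[0] - fires[1], 0)  # apply stoichiometry, clamp at 0
        traj.append(x)
    return traj

traj = tau_leap(x0=50, rates=(10.0, 0.2), tau=0.1, steps=200)
```

    The exact Direct Method would instead draw one reaction at a time; τ-leaping trades that exactness for far fewer, larger steps, which is what makes the per-step τ-selection cost worth accelerating.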

    Dense Descriptors for Optical Flow Estimation: A Comparative Study

    Estimating the displacements of intensity patterns between sequential frames is a well-studied problem, usually referred to as optical flow estimation. The first assumption of many methods in the field is brightness constancy during the movement of pixels between frames. This assumption does not hold in general, and the use of photometric invariant constraints has therefore been studied in the past. Another solution is to use structural descriptors rather than raw pixel intensities for estimating the optical flow. Since optical flow estimation seeks a dense flow field, unlike sparse feature detection/description techniques, a dense structural representation of individual pixels and their neighbors is computed and then used for matching and flow estimation. Here, a comparative study is carried out by extending the SIFT-flow framework to include more dense descriptors, and comprehensive comparisons are given. Overall, this work can serve as a baseline for stimulating further interest in the use of dense descriptors for optical flow estimation.
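    To make the descriptor-versus-intensity idea concrete, here is a toy NumPy sketch (not the SIFT-flow method itself): it uses flattened local patches as a stand-in for a dense descriptor and matches them by brute-force search in a small window, with no smoothness term:

```python
import numpy as np

def patch_descriptors(img, r=1):
    """Dense descriptor: the flattened (2r+1)x(2r+1) patch around each pixel."""
    p = np.pad(img, r, mode='edge')
    H, W = img.shape
    return np.stack([p[i:i+H, j:j+W]
                     for i in range(2*r+1) for j in range(2*r+1)],
                    axis=-1).astype(float)

def dense_flow(img1, img2, search=2):
    """Per-pixel flow by nearest-descriptor search in a (2*search+1)^2 window."""
    d1, d2 = patch_descriptors(img1), patch_descriptors(img2)
    H, W = img1.shape
    flow = np.zeros((H, W, 2), int)
    for y in range(H):
        for x in range(W):
            best = np.inf
            for dy in range(-search, search + 1):
                for dx in range(-search, search + 1):
                    yy, xx = y + dy, x + dx
                    if 0 <= yy < H and 0 <= xx < W:
                        cost = np.sum((d1[y, x] - d2[yy, xx]) ** 2)
                        if cost < best:
                            best, flow[y, x] = cost, (dy, dx)
    return flow

rng = np.random.default_rng(1)
img1 = rng.random((12, 12))
img2 = np.roll(img1, (1, 1), axis=(0, 1))   # ground-truth flow is (1, 1)
flow = dense_flow(img1, img2)
```

    Replacing the patch descriptor with a photometric-invariant one (e.g. gradient-based) is what makes such matching robust when brightness constancy fails.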

    Dynamic Denoising and Gappy Data Reconstruction Based on Dynamic Mode Decomposition and Discrete Cosine Transform

    Dynamic Mode Decomposition (DMD) is a data-driven method for analyzing dynamics, first applied in fluid dynamics. It extracts modes and their corresponding eigenvalues; the modes are spatial fields that identify coherent structures in the flow, while the eigenvalues describe the temporal growth/decay rates and oscillation frequencies of each mode. The recently introduced compressed-sensing DMD (csDMD) reduces computation times and can also deal with sub-sampled datasets. In this paper, we present a similar technique based on the discrete cosine transform that reconstructs the fully-sampled dataset (as opposed to the DMD modes, as in csDMD) from sub-sampled, noisy, and gappy data using l1 minimization. The proposed method was benchmarked against csDMD in terms of denoising and gap-filling using three datasets. The first was the 2-D time-resolved field of a double-gyre oscillator, which has about nine oscillatory modes. The second dataset was derived from a Duffing oscillator; it has several modes associated with complex eigenvalues, which makes them oscillatory. The third dataset was taken from a 2-D simulation of the wake behind a cylinder at Re = 100 and was used to investigate the effect of various parameters on the reconstruction error. The Duffing and 2-D wake datasets were tested in the presence of noise and rectangular gaps. While the performance on the double-gyre dataset is comparable to csDMD, the proposed method performs substantially better (lower reconstruction error) on the Duffing dataset and on the 2-D wake dataset, according to the defined reconstruction error metrics.
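    As a minimal 1-D sketch of the underlying idea (not the paper's algorithm), the following recovers a signal assumed sparse in the DCT basis from gappy samples by solving the l1 problem with iterative soft-thresholding (ISTA); the signal, sampling rate, and regularization weight are illustrative assumptions:

```python
import numpy as np
from scipy.fft import dct, idct

def recover_dct_l1(y, mask, n, lam=0.01, iters=500):
    """ISTA for min 0.5*||(idct(z))[mask] - y||^2 + lam*||z||_1:
    recover DCT coefficients z from samples y observed at mask==True."""
    z = np.zeros(n)
    for _ in range(iters):
        resid = np.zeros(n)
        resid[mask] = idct(z, norm='ortho')[mask] - y    # misfit at observed points
        z -= dct(resid, norm='ortho')                    # gradient step (step size 1)
        z = np.sign(z) * np.maximum(np.abs(z) - lam, 0)  # soft-threshold (l1 prox)
    return idct(z, norm='ortho')

rng = np.random.default_rng(0)
n = 128
coeffs = np.zeros(n)
coeffs[[3, 17, 40]] = [1.0, -0.7, 0.5]        # signal is 3-sparse in the DCT domain
x_true = idct(coeffs, norm='ortho')
mask = rng.random(n) < 0.6                    # keep ~60% of samples (gappy data)
x_rec = recover_dct_l1(x_true[mask], mask, n)
err = np.linalg.norm(x_rec - x_true) / np.linalg.norm(x_true)
```

    Because few DCT coefficients carry the oscillatory content, the l1 penalty fills the gaps with the smooth oscillations rather than with noise, which is the same mechanism the paper exploits per snapshot.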

    k-NN using sorting.

    <p>Here we illustrate the k-NN search for 3 vectors. The distance matrix is stored along with the column and row indices in row-major format. First, we sort the entire distance matrix with the distance as the key. The result is then sorted in a stable manner, first with the column index as the key and then, separately, with the row index as the key. Finally, we pick the k closest distances from both the column and row results.</p>
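    A CPU-side NumPy sketch of this sort-based scheme (the row-wise half, with an illustrative 3-vector example) might look like the following; on the GPU the same sorts would be done with library primitives:

```python
import numpy as np

def knn_by_sorting(D, k):
    """Row-wise k-NN from a distance matrix via sorting, as in the caption:
    sort (dist, row, col) triples by distance, then stable-sort by row index
    so distances stay ascending within each row, and take the first k."""
    n = D.shape[0]
    rows, cols = np.divmod(np.arange(D.size), n)     # row-major indices
    order = np.argsort(D.ravel(), kind='stable')     # sort by distance key
    r, c, d = rows[order], cols[order], D.ravel()[order]
    by_row = np.argsort(r, kind='stable')            # stable sort by row index
    row_knn = {i: [] for i in range(n)}
    for idx in by_row:
        if len(row_knn[r[idx]]) < k and c[idx] != r[idx]:  # skip self-distance
            row_knn[r[idx]].append((c[idx], d[idx]))
    return row_knn

X = np.array([[0.0, 0.0], [1.0, 0.0], [5.0, 0.0]])
D = np.linalg.norm(X[:, None] - X[None, :], axis=-1)
nn = knn_by_sorting(D, k=1)
```

    The column k-NNs follow symmetrically by stable-sorting on the column index instead.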

    Processing local k-NNs within nodes.

    <p>The sub-problem assigned to a node is finding the row and column k-NNs w.r.t. , which is divided into partitions. All partitions are processed by GPU . The row k-NNs are processed within GPU memory, and the merged results are written to CPU RAM. The column k-NNs are also written to CPU RAM. Later, each of the local column k-NNs is merged by a single GPU.</p>
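    The partition-then-merge step can be sketched on the CPU as follows (a stand-in for the per-GPU partitions; the matrix, k, and partition count are illustrative): each column partition yields a local k-NN list, and the candidates are merged to the global best k per row.

```python
import numpy as np

def partitioned_knn(D, k, parts):
    """Row-wise k-NN computed per column partition, then merged:
    each partition keeps its local top-k, so the union of candidates
    is guaranteed to contain the global top-k for every row."""
    n = D.shape[0]
    bounds = np.linspace(0, n, parts + 1, dtype=int)
    partial = []
    for lo, hi in zip(bounds[:-1], bounds[1:]):
        block = D[:, lo:hi]
        idx = np.argsort(block, axis=1)[:, :k]      # local k-NN in this partition
        partial.append((idx + lo, np.take_along_axis(block, idx, axis=1)))
    cand_idx = np.concatenate([p[0] for p in partial], axis=1)
    cand_dst = np.concatenate([p[1] for p in partial], axis=1)
    best = np.argsort(cand_dst, axis=1)[:, :k]      # merge: global best k
    return np.take_along_axis(cand_idx, best, axis=1)

rng = np.random.default_rng(2)
X = rng.random((8, 3))
D = np.linalg.norm(X[:, None] - X[None, :], axis=-1)
merged = partitioned_knn(D, k=2, parts=2)
direct = np.argsort(D, axis=1)[:, :2]               # unpartitioned reference
```

    Only k candidates per partition ever leave GPU memory, which is what keeps the CPU-RAM traffic small.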

    Processing global k-NNs.

    <p>In this figure, node is responsible for calculating the global k-NNs of all vectors that are in . This is done by computing and merging the local row k-NNs of . Note that the local row k-NNs w.r.t. have already been calculated when node 4 computed the local block-column k-NNs w.r.t. . The merged results are stored in a heap. The k-NNs w.r.t. are cooperatively computed by all nodes. For example, node successively computes and merges the k-NNs w.r.t. . It then transmits the results to node , which receives results from the other nodes and performs a global merge.</p>
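    The final heap-based global merge can be sketched with Python's `heapq` (the per-node candidate lists below are hypothetical; each is assumed already sorted by distance, as the local computations produce):

```python
import heapq

def merge_partial_knn(partials, k):
    """Merge sorted (distance, index) candidate lists from several nodes
    into one global k-NN list using a heap-based k-way merge."""
    merged, seen = [], set()
    for dist, idx in heapq.merge(*partials):
        if idx not in seen:                 # a neighbor may appear in several lists
            seen.add(idx)
            merged.append((dist, idx))
            if len(merged) == k:
                break
    return merged

node_a = [(0.2, 7), (0.9, 3)]               # hypothetical per-node results
node_b = [(0.1, 5), (0.9, 3)]
node_c = [(0.4, 2), (1.3, 8)]
top2 = merge_partial_knn([node_a, node_b, node_c], k=2)
```

    Because every input list is sorted, the merge stops after emitting k entries and never materializes the full union.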

    Summation kernel.

    <p>Calculation of every row of involves and one element of per row. Therefore, each thread loads an element of into a register; these data are reused to compute all rows of . Next, one thread per block reads the corresponding element of into shared memory. Each thread then reads an element of and adds to it the element held in its register and the element in shared memory, generating the corresponding element of .</p>

    Performance benchmarks for multi-GPU execution.

    <p>In this test we used 2 GPUs. In <a href="http://www.plosone.org/article/info:doi/10.1371/journal.pone.0074113#pone.0074113-Arefin1" target="_blank">[24]</a>, the 2 GPUs (Tesla 2050) were mounted on a single desktop machine. For our implementation, we used 2 nodes in our GPU cluster and opted to use only one GPU per node. The input data had dimension , and the number of closest neighbors .</p>